Density Based Script Identification of a Multilingual Document Image

Authors

  • Rumaan Bashir
  • S. M. K. Quadri
Abstract

The field of automatic pattern recognition has grown enormously over the past few decades. Document image analysis, an essential part of pattern recognition, is the process of analyzing a document image to work out its contents so that they can be manipulated as required at various levels. It involves procedures such as document classification, organization, conversion, identification and many more. Since a document chiefly contains text, script identification has become a very important area of this field. A script is the system of written characters and symbols in which the text of a document or manuscript is written, and it is used to write a particular language. Languages are written using scripts, and a script is itself made up of symbols. Every language has its own set of symbols used for writing it. Sometimes different languages are written using the same script with marginal modifications. Script identification has been performed for unilingual, bilingual and multilingual document images, but negligible work has been reported for the Kashmiri script. In this paper, we analyze and experimentally test a statistical approach for identifying the Kashmiri script in a document image alongside the Roman, Devanagari and Urdu scripts. The identification is performed on offline machine-printed scripts and yields promising results.
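
The abstract does not say which density features the paper uses. As an illustration only, one common statistical feature in script identification is the foreground pixel density of a binarized text image, optionally measured per horizontal zone. The sketch below assumes a binarized image given as a list of rows of 0/1 values; the function names `pixel_density` and `zone_densities` and the three-zone split are hypothetical choices for this example, not the authors' method.

```python
def pixel_density(image):
    """Return the fraction of foreground (1) pixels in a binarized image.

    `image` is assumed to be a list of rows, each a list of 0/1 values.
    """
    total = sum(len(row) for row in image)
    if total == 0:
        return 0.0
    return sum(sum(row) for row in image) / total


def zone_densities(image, n_zones=3):
    """Split the image into n_zones horizontal bands, top to bottom,
    and return the foreground density of each band.

    The number of zones is an assumption made for this illustration.
    """
    height = len(image)
    # Integer band boundaries covering all rows exactly once.
    bounds = [i * height // n_zones for i in range(n_zones + 1)]
    return [pixel_density(image[bounds[i]:bounds[i + 1]])
            for i in range(n_zones)]


if __name__ == "__main__":
    # Tiny synthetic "text line": ink in the top and bottom rows only.
    sample = [
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [1, 1, 1, 1],
    ]
    print(pixel_density(sample))
    print(zone_densities(sample, n_zones=3))
```

A vector of such zone densities could then be fed to any classifier to separate scripts. Which zones, thresholds, and classifier the paper actually uses is not stated in the abstract.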

Similar resources

Script Identification from Printed Document Images Using Statistical Features

Automatic identification of a script in a document image facilitates many important applications such as automatic archiving of multilingual documents; searching online archives of document images and for the selection of script specific OCR in a multilingual environment. In this work a technique for script identification from document images is proposed. The method uses vertical and horizontal...

Script Identification for Document Image Retrieval: A Survey

In recent years, many multimedia documents have been captured and stored thanks to advances in computer technology, and hence the demand for recognizing and retrieving such documents has increased tremendously. In such an environment, the large volume of data and variety of scripts make manual identification unworkable. In such cases the ability to automatically determine the script, and further the...

Script and Language Identification for Document Images and Scene Texts

In recent times, there has been an increase in Optical Character Recognition (OCR) solutions for recognizing text from scanned document images and scene texts taken with mobile devices. Many of these solutions work very well for an individual script or language. But in a multilingual environment such as India, where a document image or scene image may contain more than one language, th...

Offline Handwritten Script Identification in Document Images

Automatic handwritten script identification from document images facilitates many important applications such as sorting, transcription of multilingual documents and indexing of large collection of such images, or as a precursor to optical character recognition (OCR). In this paper, we investigate a texture as a tool for determining the script of handwritten document image, based on the observa...

Script Identification from Indian Documents

Automatic identification of a script in a given document image facilitates many important applications such as automatic archiving of multilingual documents, searching online archives of document images and for the selection of script specific OCR in a multilingual environment. In this paper, we present a scheme to identify different Indian scripts from a document image. This scheme employs hie...

Publication date: 2014